METHOD AND SYSTEM FOR
CONSTRUCTING A KNOWLEDGE PROFILE
OF A USER HAVING UNRESTRICTED AND
RESTRICTED ACCESS PORTIONS
ACCORDING TO RESPECTIVE LEVELS OF
CONFIDENCE OF CONTENT OF THE
PORTIONS

FIELD OF THE INVENTION 

The present invention relates generally to the field of 
knowledge management and, more specifically, to a method 
and apparatus for automatically constructing a user knowl- 
edge profile and knowledge repository of electronic docu- 
ments. 

BACKGROUND OF THE INVENTION 

The new field of "knowledge management" (KM) is 
receiving increasing recognition as the gains to be realized 
from the systematic effort to store and export vast knowl- 
edge resources held by employees of an organization are 
being recognized. The sharing of knowledge broadly within 
an organization offers numerous potential benefits to an 
organization through the awareness and reuse of existing 
knowledge, and the avoidance of duplicate efforts. 

In order to maximize the exploitation of knowledge 
resources within an organization, a knowledge management 
system may be presented with two primary challenges, 
namely (1) the identification of knowledge resources within 
the organization and (2) the distribution and accessing of 
information regarding such knowledge resources within the 
organization. 

The identification, capture, organization and storage of 
knowledge resources is a particularly taxing problem. Prior 
art knowledge management systems have typically imple- 
mented knowledge repositories that require users manually 
to input information frequently into pre-defined fields, and 
in this way manually and in a prompted manner to reveal 
their personal knowledge base. However, this approach 
suffers from a number of drawbacks in that the manual
entering of such information is time consuming and often
incomplete, and therefore places a burden on users who then 
experience the inconvenience and cost of a corporate knowl- 
edge management initiative long before any direct benefit is 
experienced. Furthermore, users may not be motivated to 
describe their own knowledge and to contribute documents 
on an ongoing basis that would subsequently be re-used by 
others without their awareness or consent. The manual input 
of such information places a burden on users who then 
experience the inconvenience and cost of a corporate knowl- 
edge management initiative long before any direct benefit is 
experienced. 

It has been the experience of many corporations that 
knowledge management systems, after some initial success, 
may fail because either compliance (i.e., the thoroughness 
and continuity with which each user contributes knowledge) 
or participation (i.e., the percentage of users actively con- 
tributing to the knowledge management system) falls to 
inadequate levels. Without high compliance and 
participation, it becomes a practical impossibility to maintain
a sufficiently current and complete inventory of the
knowledge of all users. Under these circumstances, the
knowledge management effort may never offer an attractive
relationship of benefits to costs for the organization as a
whole or reach a critical mass, and the original benefit of
knowledge management falls apart or is marginalized to a
small group.

In order to address the problems associated with the
manual input of knowledge information, more sophisticated
prior art knowledge management initiatives may presume
the existence of a centralized staff to work with users to
capture knowledge bases. This may, however, increase the
ongoing cost of knowledge management and require a
larger up-front investment before any visible payoff, thus
deterring the initial funding of many otherwise promising
knowledge management initiatives. Even if an initial
decision is made to proceed with such a sophisticated knowledge
management initiative, the cash expenses associated with a
large centralized knowledge capture staff may be liable to
come under attack, given the difficulty of quantifying
knowledge management benefits in dollar terms.

As alluded to above, even once a satisfactory knowledge
management information base has been established, the
practical utilization thereof to achieve maximum potential
benefit may be challenging. Specifically, ensuring that the
captured information is readily organized, available, and
accessible as appropriate throughout the organization may
be problematic.

SUMMARY OF THE INVENTION 

According to a first aspect of the invention, there is
provided a method of constructing a user knowledge profile
including first and second portions having different access
restrictions. A confidence level is automatically assigned to
content within an electronic document associated with a
user, the content being potentially indicative of a user
knowledge base. The content is then stored in either the first
or the second portion of the user knowledge profile
according to the assigned confidence level.

According to a second aspect of the invention, there is
provided apparatus for constructing a user knowledge profile
including first and second portions having different access
restrictions. The apparatus includes confidence logic to
examine an electronic document, associated with a user, and
to assign a confidence level to content within the electronic
document, the content being potentially indicative of a user
knowledge base. The apparatus further includes a profiler to
store the content in either the first or second portion of the
user knowledge profile according to the assigned confidence
level.
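
The storing step common to both aspects can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name `KnowledgeProfile`, the portion names `unrestricted` and `restricted`, the 0-to-1 confidence scale, and the 0.7 threshold are hypothetical, and the specification does not fix any of them or state here which portion receives the higher-confidence content.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical cut-off; the specification does not name a value.
UNRESTRICTED_THRESHOLD = 0.7


@dataclass
class KnowledgeProfile:
    """A user knowledge profile with two portions under different access rules."""
    unrestricted: Dict[str, float] = field(default_factory=dict)  # first portion
    restricted: Dict[str, float] = field(default_factory=dict)    # second portion

    def store(self, term: str, confidence: float,
              threshold: float = UNRESTRICTED_THRESHOLD) -> None:
        """Store a term in exactly one portion according to its confidence level."""
        if confidence >= threshold:
            target, other = self.unrestricted, self.restricted
        else:
            target, other = self.restricted, self.unrestricted
        other.pop(term, None)      # a term lives in one portion at a time
        target[term] = confidence


profile = KnowledgeProfile()
profile.store("neural networks", 0.85)   # assumed high confidence
profile.store("lunch", 0.10)             # assumed low confidence
```

In this sketch a later call with a different confidence level moves the term between portions, which is one plausible reading of "stored in either the first or the second portion".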

Other features of the present invention will be apparent
from the accompanying drawings and from the detailed
description that follows.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention is illustrated by way of example 
and not limitation in the figures of the accompanying 
drawings, in which like references indicate similar elements 
and in which: 

FIG. 1 is a block diagram illustrating a knowledge man- 
agement system, according to an exemplary embodiment of 
the present invention. 

FIG. 2 is a block diagram illustrating a knowledge site 
management server, according to an exemplary embodiment 
of the present invention. 

FIG. 3 is a block diagram illustrating a knowledge access
server, according to an exemplary embodiment of the
present invention.

FIG. 4 is a block diagram illustrating a knowledge
converter, according to an exemplary embodiment of the
present invention.

FIG. 5 is a block diagram illustrating a client software
program, and an e-mail message generated thereby,
according to an exemplary embodiment of the present invention.

FIG. 6 is a block diagram illustrating the structure of a
knowledge repository, according to an exemplary
embodiment of the present invention, as constructed from the data
contained in a repository database and a user database.

FIG. 7 is a flowchart illustrating a method, according to an
exemplary embodiment of the present invention, of
constructing a user knowledge profile.

FIG. 8 is a flowchart illustrating a high-level method,
according to an exemplary embodiment of the present
invention, by which terms may be extracted from an
electronic document and by which confidence level values may
be assigned to such terms.

FIG. 9A is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, of
determining a confidence level for a term extracted from an
electronic document.

FIG. 9B is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, by which a
document weight value may be assigned to a document
based on addressee information associated with the
document.

FIG. 10 illustrates a term-document binding table,
according to an exemplary embodiment of the present invention.

FIG. 11 illustrates a weight table, according to an
exemplary embodiment of the present invention.

FIG. 12 illustrates an occurrence factor table, according to
an exemplary embodiment of the present invention.

FIG. 13 illustrates a confidence level table, including
initial confidence level values, according to an exemplary
embodiment of the present invention.

FIG. 14 illustrates a modified confidence level table,
including modified confidence level values, according to an
exemplary embodiment of the present invention.

FIG. 15A is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
constructing a user knowledge profile that includes first and
second portions.

FIG. 15B is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
storing a term in either a first or a second portion of a user
knowledge profile.

FIG. 16A illustrates a user-term table, constructed
according to the exemplary method illustrated in FIG. 15A.

FIG. 16B illustrates a user-term table, constructed
according to the exemplary method illustrated in FIG. 15A.

FIG. 17A is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
facilitating access to a user knowledge profile.

FIG. 17B is a flowchart illustrating an alternative method,
according to an exemplary embodiment of the present
invention, of facilitating access to a user knowledge profile.

FIG. 17C is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
performing a public profile process.

FIG. 17D is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
performing a private profile process.

FIG. 17E is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, of
performing a profile modification process.

FIG. 18A is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
addressing an electronic document for transmission over a
computer network.

FIG. 18B is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
executing an "explain" function that provides the reasons for
the proposal of an e-mail recipient.

FIG. 18C is a flowchart illustrating a method, according
to an exemplary embodiment of the present invention, of
executing a "more" function that proposes further potential
recipients for an e-mail message.

FIG. 18D illustrates a user dialog, according to an
exemplary embodiment of the present invention, through which a
list of potential recipients is displayed to an addressor of an
e-mail message.

FIG. 19 is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, of
managing user authorization to publish, or permit access to,
a user knowledge profile.

FIG. 20 is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, of
assigning a confidence value, either in the form of a
confidence level value or a confidence memory value, to a term.

FIG. 21 is a flowchart illustrating a method, according to
an exemplary embodiment of the present invention, of
determining or identifying a confidence value, either in the
form of a confidence level value or a confidence memory
value, for a term.

FIG. 22 illustrates a user-term table, according to an
exemplary embodiment of the present invention, that is
shown to include a confidence level value column, a
confidence memory value column and a time stamp column.

FIG. 23 is a block diagram illustrating a machine,
according to one exemplary embodiment, within which software in
the form of a series of machine-readable instructions, for
performing any one of the methods discussed above, may be
executed.

DETAILED DESCRIPTION 

A method and apparatus for constructing and maintaining
a user knowledge profile are described. In the following
description, for purposes of explanation, numerous specific
details are set forth in order to provide a thorough
understanding of the present invention. It will be evident,
however, to one skilled in the art that the present invention
may be practiced without these specific details.

OVERVIEW 

With a view to addressing the above described difficulties
associated with manual knowledge capture either by a
profile owner or by a dedicated staff, there is provided a
method and apparatus for capturing knowledge
automatically, without excessive invasion or disruption of
normal work patterns of participating users. Further, the
present specification teaches a method and apparatus
whereby a database of captured knowledge information is
maintained continuously and automatically, without
requiring that captured knowledge information necessarily be
visible or accessible to others. The present specification also
teaches facilitating the user input and modification of a
knowledge profile associated with the user in a knowledge
database at the user's discretion.

The present specification teaches a method and apparatus
for intercepting electronic documents, such as for example
e-mail messages, originated by a user, and extracting terms
therefrom that are potentially indicative of a knowledge base
of the originating user. The extracted knowledge terms may
then be utilized to construct a user knowledge profile. The
grammatical structure, length, frequency and density with
which the extracted knowledge terms occur within electronic
documents originated by a user, and prior history of
use of the extracted knowledge terms within an organization
may furthermore be utilized to attach a metric, in the form
of a confidence level value, to the relevant knowledge terms
for the purpose of grouping, ranking, and prioritizing such
knowledge terms. Knowledge terms may furthermore be
stored in either a private or public portion of the user
knowledge profile, depending upon the confidence level
values thereof.

It will be appreciated that the large volume of e-mail
messages traversing an e-mail system over a period of time
will contain a large number of terms that may be irrelevant
to the identification of the knowledge base of a user. With a
view to determining which terms are truly indicative of a
knowledge base, a number of rules (or algorithms) may be
exercised with respect to extracted terms to identify terms
that are candidates for inclusion within a public portion of
the user knowledge profile. Further rules (or algorithms)
may be applied to an assembled knowledge profile for the
purpose of continually organizing and refining the profile.

Corporate e-mail systems have become increasingly
pervasive, and have become an accepted medium for idea
communication within corporations. Accordingly, the content
of e-mail messages flowing within a large organization
amounts to a vast information resource that, over the course
of time, may directly or indirectly identify knowledge bases
held by individuals within the organization.

The present specification also teaches addressing privacy
concerns associated with the examination of e-mail messages
for the above purposes by providing users with the
option selectively to submit originated e-mail messages for
examination, or alternatively to bypass the examination and
extraction system of the present invention.

There is also taught a computer-implemented method and
apparatus for addressing an electronic document, such as an
e-mail message, for transmission over a computer network.
The e-mail message may be examined to identify terms
therein. The identified terms are then compared to a number
of user knowledge profiles with a view to detecting a
predetermined degree of correspondence between the
identified terms and any one or more of the user knowledge
profiles. In the event that a predetermined degree of
correspondence is detected, the sender of the electronic document
is prompted to either accept or decline the proposed
recipient as an actual recipient of the electronic document,
after first being offered an opportunity to inspect the specific
basis of the correspondence between the identified terms and
the proposed recipients. The e-mail message may also be
parsed to extract recipients entered manually by the user.
The degree of correspondence between the knowledge profiles
of the manually entered recipients and the identified
terms of the message is then optionally used as the basis of
recommendations to the user that certain manually entered
recipients be dropped from the ultimate list of recipients.

This aspect of the present teachings is advantageous in
that a sender of an e-mail message is presented with a list of
proposed recipients, identified according to their knowledge
profiles and the content of the e-mail message, who may be
interested in receiving the e-mail message. Accordingly, the
problems of over-distribution and under-distribution of
e-mail messages that may be encountered within an
organization may be reduced. Specifically, in the
over-distribution situation, many users are frequently copied on
e-mail messages, resulting in lost productivity as the users
struggle to cope with increasing volumes of daily e-mail.
Further, when the time available to read e-mail messages
becomes restricted, users typically begin to defer reading of
e-mail messages, and communication efficiency within the
organization may be adversely affected. In the
under-distribution situation, it may occur that the proper recipients
of the message are not included in the distribution list, and
accordingly fall "out of the loop".

There is also taught a method of facilitating a user profile
query or look-up wherein, in response to a match between a
query and a user profile, the owner of the user profile may
be prompted for authorization to publish all (or a portion) of
the user profile to the originator of the query or to others
generally. This is advantageous in that it addresses the above
mentioned privacy concerns by treating the knowledge
profile as a confidential resource under the control of the
user. The user is thus also able to control the timing,
circumstances and extent to which it is made accessible to
others. A further advantage is that the user is prompted for
input specifically to satisfy specific, pending requests of
others. This relieves the user of the need to remember to
modify his or her profile on a regular basis and the need to
make decisions concerning the composition of the profile
prospectively, prior to any actual use of the profile by others.
In this manner the user saves time and effort, since the
determination that manual interaction with the profile is
necessary is a function of the present system, not a
responsibility of the user.

There is also taught a method of assigning a confidence
level value to a term within an electronic document. This
confidence level value is based on a first quantitative
indicator, derived from the number of occurrences of the
term within the electronic document, and a second
characteristic indicator, derived utilizing the characteristic of the
term.

For the purposes of the present application, the word
"term" shall be taken to include any acronym, word,
collection of words, phrase, sentence, or paragraph. The term
"confidence level" shall be taken to mean any indication,
numeric or otherwise, of a level within a predetermined
range.

SYSTEM ARCHITECTURE

FIG. 1 is a block diagram illustrating a knowledge
management system 10, according to an exemplary embodiment
of the present invention. The system 10 may conveniently be
viewed as comprising a client system 12 and a server system
14. The client system 12 may comprise one or more clients,
such as browser clients 16 and e-mail clients 18, that are
resident on terminals or computers coupled to a computer
network. In one exemplary embodiment, each of the browser
clients 16 may comprise the Internet Explorer client
developed by Microsoft Corp. of Redmond, Wash., or the
Netscape Navigator client developed by Netscape
Communications of Menlo Park, Calif. Each of the e-mail clients 18
may further comprise the Outlook Express, Outlook 97,
Outlook 98 or Netscape Communicator e-mail programs. As
will be described in further detail below, the browser and
e-mail clients 16 are complemented by extensions 19, that
enable the e-mail clients 18 to send an electronic message
(e.g., either an e-mail or HTML document) to a knowledge
server 22 implemented on the server side 14 of the system
10. As shown in FIG. 1, the extensions 19 may be integral
with an e-mail client 18, or external to the client 18 and in
communication therewith. The clients 16 and 18 may default
to sending every communication to a relevant component of
the knowledge server 22, while allowing a user specifically
to designate a communication not suitable for transmission
to the knowledge server 22. The user designation may be
facilitated through controls that are installed as software
modules which interact with or modify an e-mail client 18,
and which cause messages to be copied to a special e-mail
address (e.g., a Knowledge Server (KS) mailbox 25
maintained by an e-mail server 23) associated with a knowledge
server component. In the case where a client extension 19
for performing this automatic transmission is not available,
the user can manually add the e-mail address of the KS
mailbox 25 to the list of recipients for the message. Further
details in this regard are provided below. Files embedded
within an e-mail message, such as attachments, may also be
selectively included or excluded from the capture process
and may also be selectively included or excluded from
retention in a knowledge repository.

The browser clients 16 are used as an additional means to
submit documents to the knowledge server 22 at the
discretion of a user. The browser client 16 is used to access an
interface application 34, maintained on a web server 20,
which transmits documents to the knowledge server 22.

In alternate embodiments, a client may also propagate a
list of bookmarks, folders or directories to the knowledge
server 22 for the purpose of user knowledge profile
construction.

SERVER SIDE ARCHITECTURE

The server side 14 of the system 10 includes the web
server 20, the e-mail server 23 and the knowledge server 22.
The web server 20 may be any commercially available web
server program such as Internet Information Server (IIS)
from Microsoft Corporation, the Netscape Enterprise Server,
or the Apache Server for UNIX. The web server 20 includes
the interface application 34 for interfacing with the
knowledge server 22. The web server 20 may run on a single
machine that also hosts the knowledge server 22, or may
alternatively run along with the interface application 34 on
a dedicated web server computer. The web server 20 may
also be a group of web server programs running on a group
of computers to thus enhance the scalability of the system
10. As the web server 20 facilitates access to a local view of
a knowledge repository 50, maintained by the knowledge
access server 26, by the browser clients 16, the web server
interface application 34 implements knowledge application
interfaces, knowledge management interfaces, user profile
creation and maintenance interfaces, and a server
management interface. The web server 20 also facilitates knowledge
profile queries, e-mail addressing to an e-mail client 18, and
any other access to the knowledge server 22 using the
standard HTTP (web) protocol.

The knowledge server 22 includes a knowledge site
management server (KSMS) 27 and the knowledge access
server (KAS) 26. The knowledge access server 26 includes
an interface that provides a local view of a knowledge
repository 50, which is physically stored in the user database
56A and a repository database 56B. The knowledge site
management server 27 is shown to have access to the local
view of the knowledge repository 50 maintained by the
knowledge access server 26. The illustrated components of
the knowledge server 22 are collectively responsible for the
capture (termed "knowledge discovery") of terms indicative
of a user knowledge base and for the distribution of user
knowledge profile information. Knowledge discovery may
be done by the examination and processing of electronic
documents, such as e-mail messages, which may be
propagated to the e-mail server 23 from an e-mail client 18 via the
Simple Mail Transfer Protocol (SMTP), as shown at 32.
Alternatively, knowledge discovery may be implemented by
the examination of submissions from a browser client 16 via
the web server 20.

The knowledge server 22 includes the knowledge access
server 26 and the knowledge site management server 27 as
two separate and distinct server systems in view of the
divergent functions provided by the servers 26 and 27.
Specifically, the knowledge site management server 27
functions primarily to manage non-interactive processing
(e.g., the extraction of knowledge from inbound e-mail
messages), to manage the user information database 56A,
and to implement various centralized system management
processes. The knowledge site management server 27 does
not communicate interactively with clients 18, or with
clients 16 except for administrative functions. The
knowledge access server 26, on the other hand, functions primarily
to respond to queries and updates from users submitted via
clients, typically browser clients 16. Multiple instances of a
knowledge access server 26 may be required to support a
large corporate environment and to provide appropriate
scalability; however, only one knowledge site management
server 27, one user database 56A, and one repository
database 56B are typically required. In smaller
environments, the web server 20, knowledge access server
26, and knowledge site management server 27, and even the
e-mail server 23, may all optionally be deployed on the same
machine.

FIG. 2 is a block diagram illustrating an exemplary
embodiment, according to the present invention, of the
knowledge site management server 27. The server 27 is
shown to include a socket front-end 40 to facilitate
communication with the web server 20 for administrative
requests, a request handler 44, a knowledge gathering
system 28, a knowledge converter 24, and a variety of
specialized controller modules 45A-45C. The request handler 44,
upon receiving a request from the web server 20 via the
interface application 34 and socket front-end 40, starts a
session to process the request such as, for example, a request
by an authorized systems administrator to configure the
behavior of the knowledge gathering system 28.

The knowledge gathering system 28 is shown in FIG. 2 to
include an extraction controller 47, a mail system interface
42, and a term extractor 46 including confidence logic 45.
The extraction controller 47 commands the mail system
interface 42 to retrieve messages submitted by the e-mail
client extensions 19 to the KS mailbox 25 on the e-mail
server 23 for the purpose of extraction and processing. The
extraction controller 47 can request this continuously or
periodically on a scheduled basis, so that messages can be
processed at a convenient time when computing resources
are lightly loaded, for example, overnight. The mail system
interface 42 retrieves e-mail messages from the e-mail
server 23 using the Simple Mail Transfer Protocol (SMTP),
Post Office Protocol 3 (POP3), or Internet Message Access
Protocol 4 (IMAP4) protocols. The mail system interface 42
propagates electronic documents directly to a term extractor
46, including confidence logic 45, that operates to convert
electronic documents into per-user knowledge profiles that
are stored in a knowledge repository 50. The term extractor
46 may include any commercially available term extraction
engine (such as "NPTOOL" from LingSoft Inc. of Helsinki,
Finland, or "Themes" from Software Scientific) that
analyzes the electronic document, recognizes noun phrases in
the document, and converts such phrases to a canonical form
for subsequent use by the confidence logic 45 as candidate
terms in a knowledge profile.
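
The flow from retrieved message to per-user profile can be sketched as follows. This is a deliberately crude stand-in, not the engines named above: candidate terms are approximated as runs of non-stopword words within a sentence, the canonical form is assumed to be lower case with single spaces, and the names `STOPWORDS`, `extract_terms`, `process_message` and `repository` are hypothetical. A real deployment would rely on a proper noun-phrase recognizer and on the confidence logic 45.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real extractor would use linguistic analysis.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def extract_terms(text: str) -> Counter:
    """Return occurrence counts of candidate terms in canonical form.

    Candidate terms are approximated as runs of consecutive non-stopword
    words inside one sentence, lower-cased and joined by single spaces.
    """
    counts: Counter = Counter()
    for segment in re.split(r"[.!?;:,\n]+", text.lower()):
        run = []
        for word in re.findall(r"[a-z][a-z0-9-]*", segment):
            if word in STOPWORDS:
                if run:
                    counts[" ".join(run)] += 1
                run = []
            else:
                run.append(word)
        if run:  # flush the run at the end of the sentence
            counts[" ".join(run)] += 1
    return counts


# Per-user term counts, standing in for the knowledge repository 50.
repository = defaultdict(Counter)


def process_message(sender: str, body: str, repo=repository) -> None:
    """Add a message's candidate terms to the sender's accumulated counts."""
    repo[sender].update(extract_terms(body))


process_message(
    "alice@example.com",
    "Packet routing is slow. The packet routing of the core is stable.",
)
```

Repeated occurrences accumulate per sender, so the counts could later feed the quantitative indicator on which a confidence level value is based.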






6,115,709 


The term extractor 46 performs a variety of steps when
parsing and decoding an electronic document, such as
interpreting any special attributes or settings encoded into the
header of the message of the e-mail client 18, resolving the
e-mail addresses of recipients against either the built-in user
database or an external user database, preprocessing the
electronic document, extracting noun-phrases from the text
as candidates for knowledge terms, processing these
knowledge terms, and storing summary information about the
document and extraction process in the databases 56A and
56B. The term extractor 46 further detects and strips out
non-original texts, attachments and in some cases the entire
electronic document based on the document not meeting
predetermined minimum criteria. Further details regarding
the exact procedures implemented by the term extractor 46
will be provided below. Once the term extractor 46 has
extracted the knowledge terms, the knowledge repository 50
is updated. Specifically, new terms are added and repetitions
of known terms are used to update the knowledge repository
50.

The knowledge repository 50 is defined by a hierarchical
structure of classes. The objects of these classes represent
the knowledge information that includes, inter alia, user
profiles (including knowledge profiles) and organizational
structure, and are stored in two databases: the user database
56A and the repository database 56B. The repository
database 56B contains profile and repository information and
can use one of a number of commercial relational database
management systems that support the Open DataBase
Connectivity (ODBC) interface standard. A database interface
54 provides a logical database-independent class API to
access the physical databases and to shield the complete
server codes from accessing database native API so that the
server process can use any relational database management
system (RDBMS). Because the repository database 56B is
open to inspection by systems administrators, and may be
hosted on an existing corporate system, special measures
may be taken to enhance the privacy of information in the
repository database 56B; for example, the repository
database 56B contains no actual user names or e-mail addresses,
but instead may use encrypted codes to represent users in a
manner that is meaningful only in combination with the user
database. The user database 56A is a small commercial
RDBMS embedded into the knowledge repository 50 in
such a way that it cannot be accessed except through the

troller 45B manages the "recalculation" of profiles. The
actual operation is performed within the knowledge access
server 26, which has a knowledge repository 50 interface.

A case controller 45A keeps track of open cases and
initiates notifications to users concerning their status. A
"case" is a pending request from one user to another, as will
be detailed below. For example, if a user requests an expert
in a certain field via a browser client 16, the
knowledge access server 26 matches the term against both the
public and private portions of all user profiles. If a high
confidence, but private, match is found, the system cannot
reveal the identity of the matched person to the inquirer and
must therefore open a "case". The case places a notification
in the profile "home" page of the target user and/or transmits
[...] a link back to that page. The target
user may then (via a browser):

1. [...]
2. See comments added by the inquirer.
3. [...]
4. [...] based on that term.
5. Go into the profile and edit the term responsible for the
match.
6. Indicate that the case is accepted and provide
authorization to reveal the identity of the target to the inquirer.

From the perspective of the inquirer, private matches are
initially returned with a match strength only and do not
reveal the name of the person or document matched. The
user can then initiate cases for any or all of these private
matches, based on how urgently the information is needed,
how good the matches were, and whether the public matches
are sufficient. Each case gets an expiration date set by the
inquirer and notification options regarding how the inquirer
wants to be told about the disposition of the case. Open cases
are summarized in the Web area for the inquirer, along with
the date and query that generated the return values. If the
target denies a case, that status is communicated to the user.
The user has no option to send e-mail or otherwise further
identify that person. If the target accepts the case, the
identity of the target is communicated to the user by
updating the case record and the case is closed. Case history
retention options are a site administration option.

FIG. 3 is a block diagram illustrating the components that
constitute the knowledge access server 26. The knowledge
access server 26 is shown to include a socket front-end 40
to facilitate communication with the web server interface
interfaces offered by the system 10. The user database 56A application 34. The knowledge access server 26 further 
contains encrypted identifying codes that allow the names of includes a request handler 44^, a term extractor 46, a knowl- 
actual users to be associated with e-mail addresses, login edge repository 50 and a database interface 54 that function 
IDs, passwords, and profile and repository information in the a manner similar to that described above with reference to 
repository database. 50 the knowledge gathering system 28. Tbe term extractor 46 

A lexicon controller 45C is responsible for building tables^ i ncludes comparison logic 51, th e functioning of which will 
of associated terms. Terms are considered "associated" with I be described t)elow. 'i he knowledge access server 26 func- 
each other to the extent that they tend to co-occur in close | tions primarily as an interface between knowledge users and 
proximity within the documents of multiple users. The / the knowledge repository 50. It provides services to the web 
lexicon controller 45C manages the background process of (55 server interface application 34, which implements a nimiber 
data mining that is used to discover associations between I of user interfaces as described above for interacting with the 
terms and record those in special association tables withinj^ knowledge repository 50. 

the repository database 56B. ^ FIG. 4 is a block diagram illustrating the components that 

A profile controller 45B is a module that may optionally constitute the knowledge converter 24. Hie knowledge 
be included within the knowledge site management server 60 converter 24 is shown to include a term extractor 46 that is 
27, and manages a queue of pending, compute-intensive fed from an array of format converters 60. The knowledge 
operations associated with updating profiles. Since the algo- converter 24 is able to access the knowledge repository 50, 
rithm for the confidence level value calculation of a term and to import data fi:om other knowledge systems, or export 
(embodied in the confidence logic 45) depends on the total knowledge to other knowledge systems, via each of the 
number of documents profiled, the confidence level value 65 format converters 60. 

for each and every term in a user's profile is technically Returning to FIG. 1, the knowledge access server 26 

obsolete when any document is profiled. The profile con- implements the interface to the knowledge repository 50 and 
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the knowledge site management server 27 is shown to access 
the knowledge repository 50 via the knowledge access
server 26. FIGS. 3 and 4 illustrate data for the knowledge
repository 50 as residing in databases 56A and 56B. The
databases 56A and 56B are built on a general database
interface 54 and provide persistent storage for the core
system classes referred to above. In one exemplary embodi-
ment of the present invention, the user database and the
repository databases are implemented utilizing the
Microsoft SQL server, developed by Microsoft Corp. of
Redmond, Wash., to provide default storage management
services for the system. However, programming may be
done at a more general level to allow for substitution of other
production class relational database management systems,
such as those developed by Sybase, Oracle or Informix.

CLIENT SIDE ARCHITECTURE

FIG. 5 is a diagrammatic representation of a client, 
according to an exemplary embodiment of the present 
invention, in the form of an e-mail client 18. It will be
appreciated that the e-mail client 18 may be any commer- 
cially available e-mail client, such as a Microsoft Exchange, 
Outlook Express, Outlook 97/98 or Lotus Notes client. The 
e-mail client 18 includes modifications or additions, in the 
form of the extensions 19, to the standard e-mail client to 
provide additional functionality. Specifically, according to 
an exemplary embodiment of the present invention, three 
subsystems are included within the e-mail client extensions 
19, namely a user interface 80, a profiling system 82, and an
addressing system 84. 

The profiling system 82 implements properties on an
originated message, as well as menu and property sheet
extensions at global and message levels for users to set and
manipulate these new properties. More specifically, the profiling
system 82 provides a user with a number of additional
options that determine how a message 85 propagated from
the e-mail client 18 to the knowledge repository 50 will be
processed and handled for the purposes of knowledge man-
agement. A number of the provided options are global, while
others apply on a per-message basis. For example, according
to one exemplary embodiment, the following per-message 
options (or flags) may be set by a user to define the 
properties of an e-mail message: 

1. An "Ignore" flag 86 indicating the e-mail message
should not be processed for the purposes of con-
structing or maintaining a user knowledge profile, and
should not be stored. 

2. A "Repository" parameter 88 indicating that the mes-
sage may be processed for the purposes of constructing
a knowledge profile and then stored in the repository 50
for subsequent access as a document by others. The
"Repository" parameter 88 also indicates whether the
document (as opposed to terms therein) is to be stored
in a private or public portion of the repository 50.

A number of global message options may also be made
available to a user for selection. For example, an e-mail
address (i.e., the KS mailbox 25 or the e-mail server 23) for
the knowledge server 22 may be enabled, so that the e-mail
message is propagated to the server 22.

Actual implementation and presentation of the above 
per-message and global options to the user may be done by 
the addition of a companion application or set of software 
modules which interact with APIs provided by e-mail
clients, or modules which modify the e-mail client itself,
which are available during message composition. If the user 
activates the Ignore flag 86, the profiling system 82 will not
make any modifications to the message and no copy of the
message will be sent to the knowledge gathering system 28
via the KS mailbox 25. Otherwise, per-message options, 
once obtained from the user, are encoded. Subsequently, 
when the user chooses to send the message 85 using the 
appropriate control on the particular e-mail client 18, the 
e-mail address of the knowledge gathering server is 
appended to the blind copy list for the message. The
profiling system 82 encrypts and encodes the following
information into the message header, for transmission to and 
decoding by the knowledge gathering system 28, in accor- 
dance with Internet specification RFC 1522: 

1. The list of e-mail addresses in the "to:" and "cc:" lists;

2. Per-message options as appropriate;

3. For those recipients suggested by the addressing system
84 (see below), a short list of topic identifiers including
the primary topics found within the message and the
primary topics found within the user profile that formed
a basis of a match; and

4. Security information to validate the message as authen-
tic.
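
As a rough illustration of the encoding step, the sketch below packs the per-message options into one extra header using the encoded-word syntax of RFC 1522 (and its successor RFC 2047), as implemented by Python's standard `email.header` module. The header name `X-KS-Options`, the JSON field names and the example address are assumptions made for illustration only; the encryption and the security information listed above are omitted.

```python
import json
from email.header import Header, decode_header
from email.message import Message


def encode_options(msg, to, cc, ignore, repository):
    """Attach the per-message options to msg as one encoded header."""
    # Hypothetical payload layout; the specification does not define one.
    payload = json.dumps(
        {"to": to, "cc": cc, "ignore": ignore, "repository": repository}
    )
    # Header(...).encode() yields RFC 2047 encoded-words (=?utf-8?...?=).
    msg["X-KS-Options"] = Header(payload, charset="utf-8").encode()
    return msg


def decode_options(msg):
    """Recover the options on the receiving (knowledge gathering) side."""
    parts = decode_header(msg["X-KS-Options"])
    text = "".join(
        chunk.decode(charset or "ascii") if isinstance(chunk, bytes) else chunk
        for chunk, charset in parts
    )
    return json.loads(text)
```

A receiving component would call `decode_options` on the incoming message before handing the body to the term extractor; a mail reader that does not know the header simply ignores it.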

When the message 85 is sent over the normal e-mail 
transport, the following events occur:

1. Recipients on the "to:" and "cc:" lists will receive a
normal message with an extra header containing the
encoded and encrypted options. This header is nor- 
mally not displayed by systems that read e-mail and can 
be ignored by recipients; 

2. The recipients will not be aware that the knowledge 
gathering system has received a blind copy of the 
message; and 

3. If the sender chooses to archive a copy of the message 
85, the e-mail address of the knowledge gathering 
system 28 will be retained in the "bcc" field as a 
reminder that the message was sent to the knowledge 
gathering server. 

Further details concerning the addressing system 84 will
be discussed below. 

THE REPOSITORY 

FIG. 6 is a block diagram illustrating the structure of the 
repository 50, according to one exemplary embodiment of 
the present invention, as constructed from data contained in 
the repository database 56B, and the user database 56A. The
repository 50 is shown to include a number of tables, as
constructed by a relational database management system
(RDBMS). Specifically, the repository 50 includes a user
table 90, a term table 100, a document table 106, a user-term
table 112, a term-document table 120 and a user-document
table 130. The user table 90 stores information regarding
users for whom knowledge profiles may be constructed, and 
includes an identifier column 92, including unique keys for 
each entry or record within the table 90. A name column 94 
includes respective names for users for whom knowledge 
profiles are maintained within the repository 50. A depart- 
ment column 96 contains a description of departments 
within an organization to which each of the users may be 
assigned, and an e-mail column 98 stores respective e-mail 
addresses for the users. It will be appreciated that the 
illustrated columns are merely exemplary, and a number of 
other columns, storing further information regarding users, 
may be included within the user table 90. 

The term table 100 maintains a respective record for each 
term that is identified by the term extractor 46 within an
electronic document, and that is included within the reposi-
tory 50. The term table 100 is shown to include an identifier
column 102, that stores a unique key for each term record,
and a term column 104 within which the actual extracted and
identified terms are stored. Again, a number of further
columns may optionally be included within the term table
100. The document table 106 maintains a respective record
for each document that is processed by the term extractor 46
for the purposes of extracting terms therefrom. The docu- 
ment table 106 is shown to include an identifier column 108, 
that stores a unique key for each document record, and a 
document name column 110, that stores an appropriate name 
for each document analyzed by the term extractor 46. 

The user-term table 112 links terms to users, and includes 
at least two columns, namely a user identifier column 114, 
storing keys identifying users, and a term identifier column 
116, storing keys identifying terms. The user-term table 112 
provides a many-to-many mapping of users to terms. For 
example, multiple users may be associated with a single 
term, and a single user may similarly be associated with 
multiple terms. The table 112 further includes a confidence 
level column 118, which stores respective confidence level
values, calculated in the manner described below, for each 
user-term pair. The confidence level value for each user-term 
pair provides an indication of how strongly the relevant term 
is coupled to the user, and how pertinent the term is in 
describing, for example, the knowledge base of the relevant 
user. 

The term-document table 120 links terms to documents, 
and provides a record of which terms occurred within which 
document. Specifically, the term-document table 120 
includes a term identifier column 122, storing keys for 
terms, and a document identifier column 124, storing keys 
for documents. The table 120 further includes an adjusted 
count column 126, which stores values indicative of the 
number of occurrences of a term within a document, 
adjusted in the manner described below. For example, the 
first record within the table 120 records that the term 
"network" occurred within the document "e-mail 1" 2.8
times, according to the adjusted count. 

The user-document table 130 links documents to users, 
and includes at least two columns, namely a user identifier 
column 132, storing keys identifying users, and a document 
identifier column 134, storing keys identifying various docu- 
ments. For example, the first record within the exemplary 
user-document table 130 indicates that the user "Joe" is 
associated with the document "e-mail 1". This association 
may be based upon the user being the author or recipient of 
the relevant document. 
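
The six tables described above can be sketched as a small relational schema. The example below uses Python's built-in `sqlite3` module purely as a stand-in for the RDBMS; the table and column names are assumptions chosen to mirror the description (identifier, name, department, e-mail, confidence level, adjusted count), not names taken from the figures.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT,
                    department TEXT, email TEXT);          -- user table 90
CREATE TABLE terms (id INTEGER PRIMARY KEY, term TEXT UNIQUE);  -- term table 100
CREATE TABLE documents (id INTEGER PRIMARY KEY, name TEXT);     -- document table 106
CREATE TABLE user_term (                                   -- user-term table 112
    user_id INTEGER REFERENCES users(id),
    term_id INTEGER REFERENCES terms(id),
    confidence REAL,
    PRIMARY KEY (user_id, term_id));
CREATE TABLE term_document (                               -- term-document table 120
    term_id INTEGER REFERENCES terms(id),
    document_id INTEGER REFERENCES documents(id),
    adjusted_count REAL,
    PRIMARY KEY (term_id, document_id));
CREATE TABLE user_document (                               -- user-document table 130
    user_id INTEGER REFERENCES users(id),
    document_id INTEGER REFERENCES documents(id),
    PRIMARY KEY (user_id, document_id));
"""


def build_repository():
    """Create an empty in-memory repository with the six tables."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def terms_for_user(db, user_name):
    """List (term, adjusted count) pairs from documents linked to a user."""
    rows = db.execute(
        """SELECT t.term, td.adjusted_count
             FROM users u
             JOIN user_document ud ON ud.user_id = u.id
             JOIN term_document td ON td.document_id = ud.document_id
             JOIN terms t ON t.id = td.term_id
            WHERE u.name = ?
            ORDER BY t.term""",
        (user_name,),
    )
    return rows.fetchall()
```

The composite primary keys on the three link tables express the many-to-many mappings: a user may be linked to many terms or documents, and each term or document to many users.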

IDENTIFICATION OF KNOWLEDGE TERMS
AND THE CALCULATION OF ASSOCIATED
CONFIDENCE LEVEL VALUES

FIG. 7 is a flow chart illustrating a method 140, according 
to an exemplary embodiment of the present invention, of
constructing a user knowledge profile. FIG. 7 illustrates
broad steps that are described in further detail with reference
to subsequent flow charts and drawings. The method 140
commences at step 142, and proceeds to decision box 144,
wherein a determination is made as to whether an electronic
document, for example in the form of an e-mail propagated
from an e-mail client 18, is indicated as being a private
document. This determination may be made at the e-mail
client 18 itself, at the e-mail server 23, or even within the
knowledge site management server 27. This determination
may furthermore be made by ascertaining whether the
Ignore flag 86, incorporated within an e-mail message 85, is
set to indicate the e-mail message 85 as private. As discussed
above, the Ignore flag 86 may be set at a user's discretion
utilizing the profiling system 82, accessed via the user
interface 80 within the extensions 19 to the e-mail client 18.
In the event that the electronic document is determined to be 
private, the method 140 terminates at step 146, and no 
further processing of the electronic document occurs. 
Alternatively, the method 140 proceeds to step 148, where 
confidence level values are assigned to various terms within 
the electronic document. At step 150, a user knowledge 
profile is constructed utilizing the terms within the electronic 
document to which confidence level values were assigned at 
step 148. The method 140 then terminates at step 146. 
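
The control flow of the method 140 can be summarized in a few lines. This is a minimal sketch only: the dictionary-based document, the `ignore` key and the `assign_confidence` callable are hypothetical stand-ins for the Ignore flag 86 and for step 148, whose details are described separately.

```python
from typing import Callable, Dict


def construct_profile(
    document: dict,
    profile: Dict[str, float],
    assign_confidence: Callable[[dict], Dict[str, float]],
) -> Dict[str, float]:
    """Update a user's term -> confidence profile from one document."""
    # Decision box 144: a document marked private is not processed at all.
    if document.get("ignore", False):
        return profile  # step 146: terminate without further processing
    # Step 148: assign confidence level values to terms in the document.
    scored = assign_confidence(document)
    # Step 150: build (here: update) the profile from the scored terms.
    profile.update(scored)
    return profile
```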

FIG. 8 is a flow chart illustrating a high-level method 148, 
according to an exemplary embodiment of the present 
invention, by which terms may be extracted from an elec- 
tronic document, and by which confidence level values may 
be assigned to such terms. The method 148 comprises two
primary operations, namely a term extraction operation
indicated at 152, and a confidence level value assigning
operation, indicated at 154. The method 148 implements
one methodology by which the step 148 shown in FIG. 7
may be accomplished. The method 148 begins at step 160,
and then proceeds to step 162, where an electronic
document, such as for example an e-mail, an HTML
document and/or a database query, is received at the
knowledge site management server 27 via the mail system 
interface 42. For the purposes of explanation, the present 
example will assume that an e-mail message, addressed to 
the KS mailbox 25, is received at the knowledge site 
management server 27 via the mail system interface 42, 
from the e-mail server 23. At step 164, terms and associated 
information are extracted from the electronic document. 
Specifically, the e-mail message is propagated from the mail 
system interface 42 to the term extractor 46, which then 
extracts terms in the form of, for example, grammar terms, 
noun phrases, word collections or single words from the 
e-mail message. The term extractor 46 may further parse a 
header portion of the e-mail to extract information therefrom 
that is required for the maintenance of both the repository 
and user databases 56B and 56A. For example, the term 
extractor 46 will identify the date of transmission of the 
e-mail, and all addressees. The term extractor 46 will 
additionally determine further information regarding the 
electronic document and terms therein. For example, the 
term extractor 46 will determine the total number of words 
comprising the electronic document, the density of recurring 
words within the document, the length of each term (i.e., the 
number of words that constitute the term), the part of speech 
that each word within the document constitutes, and a word 
type (e.g., whether the word is a lexicon term). To this end, 
the term extractor 46 is shown in FIG. 2 to have access to 
a database 49 of lexicon terms, which may identify both 
universal lexicon terms and environment lexicon terms 
specific to an environment within which the knowledge site 
management server 27 is being employed. For example, 
within a manufacturing environment, the collection of envi- 
ronment lexicon terms will clearly differ from the lexicon 
terms within an accounting environment. 
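
Some of the per-term information gathered at step 164 can be illustrated as follows. The sketch assumes that every term is a single lower-cased word found by a simple regular expression, and that density is the term's share of all words in the document expressed as a percentage; noun phrases, part-of-speech tagging and the lexicon lookup in database 49 are outside its scope.

```python
import re
from collections import Counter
from typing import Dict


def term_statistics(text: str) -> Dict[str, dict]:
    """Return count, density (percent) and length for each word in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    total = len(words)  # total number of words in the document
    counts = Counter(words)
    return {
        word: {
            "count": n,
            "density_pct": 100.0 * n / total,
            "length": 1,  # single-word terms only in this sketch
        }
        for word, n in counts.items()
    }
```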

Following the actual term extraction, a first relevancy
indicator in the form of an adjusted count value is calculated
for each term within the context of the electronic document
at step 168. At step 170, a second relevancy indicator in the
form of a confidence level is calculated for each term within
the context of multiple electronic documents associated with
a particular user. Further details regarding steps 168 and 170
are provided below. The method 148 then terminates at step
172.
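
The description of FIG. 10 further on gives density and count thresholds for assigning a binding-strength class, and step 190 sums adjusted counts across documents. A minimal sketch of those two pieces follows. The handling of exact boundary values (for example a density of exactly two percent) is an assumption, since the text does not say which class such a value falls in, and the row layout `(term, document, adjusted_count)` is likewise hypothetical. The occurrence factor and confidence level tables of FIGS. 12 and 13 are not reproduced.

```python
from typing import Iterable, Tuple


def binding_class(count: int, density_pct: float) -> str:
    """Assign a binding-strength class from density, else from count."""
    if density_pct > 4.0:
        return "A"
    if density_pct >= 2.0:
        return "B"
    if density_pct >= 1.0:
        return "C"
    if density_pct >= 0.5:
        return "D"
    # Below 0.5 percent the count value decides the class.
    return "E" if count > 3 else "F"


def summed_adjusted_count(
    term: str, rows: Iterable[Tuple[str, str, float]]
) -> float:
    """Sum a term's adjusted counts over all documents in rows."""
    return sum(adjusted for t, _doc, adjusted in rows if t == term)
```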
FIG. 9A is a flow chart illustrating a method 154, according
to an exemplary embodiment of the present invention, of
determining a confidence level for a term extracted from an
electronic document. Following the commencement step
180, a term and associated information is received at the
confidence logic 45, included within the term extractor 46.
While the confidence logic 45 is shown to be embodied in
the term extractor 46 in FIG. 2, it will be appreciated that the
confidence logic 45 may exist independently and separately
of the term extractor 46. In one embodiment, the associated
information includes the following parameters:

1. A count value indicating the number of occurrences of
the term within a single electronic document under
consideration;

2. A density value, expressed as a percentage, indicating
the number of occurrences of the term relative to the total
number of terms within the electronic document;

3. A length value indicating the total number of words
included within the relevant term;

4. A Part of Speech indication indicating the parts of
speech that words included within the term comprise (e.g.,
nouns, verbs, adjectives, or adverbs); and

5. A Type indication indicating whether the term comprises
a universal lexicon term, an environment lexicon
term, or is of unknown grammatical structure.

At step 184, a "binding strength", indicative of how
closely the term is coupled to the electronic document under
consideration, is determined. While this determination may
be made in any number of ways, FIG. 10 shows an exemplary
term-document binding table 200, utilizing which a
class may be assigned to each of the extracted terms.
Specifically, the term-document binding table 200 is shown
to include three columns, namely a "number of occurrences"
column 202, a density column 204, and an assigned class
column 206. A term having a density value of greater than
four percent, for example, is identified as falling in the "A"
class, a term having a density of between two and four
percent is identified as falling in the "B" class, a term having
a density of between one and two percent is identified as
falling in the "C" class, while a term having a density of
between 0.5 and one percent is identified as falling in the "D"
class. For the terms having a density of above 0.5 percent,
the density value is utilized to assign a class. For terms
which have a density value less than 0.5 percent, the count
value is utilized for this purpose. Specifically, a term having
a count value of greater than 3 is assigned to the "E" class,
and a term having a count value of between 1 and 3 is
assigned to the "F" class. Accordingly, the assigned class is
indicative of the "binding strength" with which the term is
associated with or coupled to the electronic document under
consideration.

At step 186, a characteristic (or qualitative) indicator in
the form of a term weight value is determined, based on
characteristics or qualities of the term such as those represented
by the Type and Part of Speech indications discussed above.
While this determination may again be made in any number
of ways, FIG. 11 shows an exemplary weight table 210,
utilizing which a weight value may be assigned to each of
the extracted terms. Specifically, the weight table 210 is
shown to include four columns, namely a weight column
212, a type column 214, a length column 216 and a Part of
Speech column 218. By identifying an appropriate combination
of type, length and Part of Speech indications, an
appropriate term weight value is assigned to each term. In
the type column 214, a type "F" indication identifies an
environment lexicon term, a type "L" indication identifies a
universal lexicon term, and a type "U" indication identifies
a term of unknown grammatical structure for a given length.
The entries within the length column 216 indicate the
number of words included within the term. The entries
within the Part of Speech column 218 indicate the parts of
speech that the words within a term comprise. The "A"
indication identifies an adjective, the "V" indication identifies
a verb, the "N" indication identifies a noun, and the
"X" indication identifies an unknown part of speech. By
mapping a specific term to an appropriate entry within the
weight table 210, an appropriate term weight value, as
indicated in the weight column 212, may be assigned to the
term.

At step 188, a relevancy quantitative indicator in the form
of an adjusted count value for each term is calculated, this
adjusted count value being derived from the binding strength
and term weight values calculated at steps 184 and 186.
While this determination may again be made in any number
of ways, FIG. 12 shows an exemplary occurrence factor
table 220, utilizing which an adjusted count value for the
relevant term may be determined. The occurrence factor
table 220 is shown to include values for various binding
strength/term weight value combinations. The adjusted
count value is indicative of the importance or relevance of
a term within a single, given document, and does not consider
the importance or relevance of the term in view of any
occurrences of the term in other electronic documents that
may be associated with a particular user.

At step 190, a determination is made as to whether any
adjusted count values exist for the relevant term as a result
of the occurrence of the term in previously received and
analyzed documents. If so, the adjusted count values for
occurrences of the term in all such previous documents are
summed.

At step 192, an initial confidence level value for the term
is then determined based on the summed adjusted counts and
the term weight, as determined above with reference to the
weight table 210 shown in FIG. 11. To this end, FIG. 13
illustrates a confidence level table 230, which includes
various initial confidence level values for various summed
adjusted count/weight value combinations that may have
been determined for a term. For example, a term having a
summed adjusted count of 0.125, and a weight value of 300,
may be allocated an initial confidence level value of 11.5.
Following the determination of an initial confidence level
value, confidence level values for various terms may be
grouped into "classes", which still retain cardinal meaning,
but which standardize the confidence levels into a finite
number of "confidence bands". FIG. 14 illustrates a modified
table 240, derived from the confidence level table 230,
wherein the initial confidence levels assigned are either
rounded up or rounded down to certain values. By grouping
into classes by rounding, applications (like e-mail
addressing) can make use of the classes without specific
knowledge of, or dependence on, the numerical values. These can
then be tuned without impact to the applications. The
modified confidence level values included within the table
240 may have significance in a number of applications. For
example, users may request that terms with a confidence
level of greater than 1000 automatically be published in a
"public" portion of their user knowledge profile. Further,
e-mail addressees for a particular e-mail may be suggested
based on a match between a term in the e-mail and a term
within the user knowledge profile having a confidence level
value of greater than, merely for example, 600.

The method 154 then terminates at step 194.

In a further embodiment of the present invention, the
method 154, illustrated in FIG. 9A, may be supplemented by
a number of additional steps 195, as illustrated in FIG. 9B,
by which a "document weight" value is assigned to a
document based on addressee information associated with
the document. The document weight value may be utilized
in any one of the steps 182-192 illustrated in FIG. 9A, for
example, as a multiplying factor to calculate a confidence
level value for a term. In one exemplary embodiment, the
binding strength value, as determined at step 184, may be
multiplied by the document weight value. In another exemplary
embodiment, the term weight value, as determined at
step 186, may be multiplied by the document weight value.

The document weight value may be calculated by the
confidence logic 45 within the term extractor 46. Referring
to FIG. 9B, at step 196, the confidence logic 45 identifies the
actual addressee information. To this end, the term extractor
46 may include a header parser (not shown) that extracts and
identifies the relevant addressee information. At step 197,
the confidence logic 45 then accesses a directory structure
that may be maintained by an external communication
program for the purposes of determining the level of seniority
within an organization of the addressees associated with
the document. In one exemplary embodiment of the
invention, the directory structure may be a Lightweight
Directory Access Protocol (LDAP) directory maintained by
a groupware server, such as Microsoft Exchange or Lotus
Notes. At step 198, a cumulative seniority level for the
various addressees is determined by summing seniority
values for each of the addressees. At step 199, the summed
seniority value is scaled to generate the document weight
value. In this embodiment, the cumulative or summed
seniority level of the various addressees comprises an
"average" seniority value that is used for the purpose of
calculating the document weight term. Alternatively, instead of
summing the seniority values at step 198, a "peak"
seniority value (i.e., a seniority value based on the seniority
level of the most senior addressee) may be identified and
scaled at step 199 to generate the document weight value.

In alternative embodiments, the addressee information
may be utilized in a different manner to generate a document
weight value. Specifically, a document weight value may be
calculated based on the number of addressees, with a higher
number of addressees resulting in a greater document weight
value. Similarly, a document weight value may be calculated
based on the number of addressees who are included within
a specific organizational boundary (e.g., a specific department
or division). For example, an e-mail message
addressed primarily to an executive group may be assigned
a greater document weight value than an e-mail message
addressed primarily to a group of subordinates. Further, the
document weight value may also be calculated using any
combination of the above discussed addressee information
characteristics. For example, the document weight value
could be calculated using both addressee seniority and
addressee number information.

FIG. 15A is a flow chart illustrating a method 250,
according to one exemplary embodiment of the present
invention, of constructing a user profile that includes first
and second portions that may conveniently be identified as

[...]

user, responsive to a specific request and without specific
authorization from the target user. Restricted access, on the
other hand, may require specific authorization by the target
user for the provision of information concerning the user
knowledge profile, and the target user, in response to a
specific request. The method 250 commences at step 252,
and then proceeds to step 254, where a determination is
made regarding the confidence level value assigned to a
term, for example using the method 154 described above
with reference to FIG. 9A. Having determined the confidence
level value, the method 250 proceeds to step 256,
where a threshold value is determined. The threshold value
may either be a default value, or a user specified value, and
is utilized to categorize the relevant term. For example, users
may set the threshold through the browser interface as a
fundamental configuration for their profile. If set low, the
user profile will be aggressively published to the public side.
If set high, only terms with a high level of confidence will
be published. Users can also elect to bypass the threshold
publishing concept altogether, manually reviewing each
term that crosses the threshold (via the notification manager)
and then deciding whether to publish. At decision box 258,
a determination is made as to whether the confidence level
value for the term is less than the threshold value. If so, this
may be indicative of a degree of uncertainty regarding the
term as being an accurate descriptor of a user's knowledge.
Accordingly, at step 260, the relevant term is then stored in
the "private" portion of the user knowledge profile.
Alternatively, should the confidence level value be greater
than the threshold value, this may be indicative of a greater
degree of certainty regarding the term as an accurate descriptor
of a user's knowledge, and the relevant term is then
stored in the "public" portion of the user's knowledge profile
at step 262. The method 250 then terminates at step 264.

FIG. 16A shows an exemplary user-term table 112, constructed
according to the method 250 illustrated in FIG.
15A. Specifically, the table 112 is shown to include a first
user knowledge profile 270 and a second user knowledge
profile 280. The first user knowledge profile 270 is shown to
include a "public" portion 272, and a "private" portion 274,
the terms within the "private" portion 274 having an
assigned confidence level value (as indicated in the confidence
level column 118) below a threshold value of 300. The
second user knowledge profile 280 similarly has a "public"
portion 282 and a "private" portion 284.

The exemplary user-term table 112 shown in FIG. 16A
comprises an embodiment of the table 112 in which the
public and private portions are determined dynamically with
reference to a confidence level value assigned to a particular
user-term pairing. FIG. 16B illustrates an alternative
embodiment of the user-term table 112 that includes a
"private flag" column 119, within which a user-term pairing
may be identified as being either public or private, and
accordingly part of either the public or private portion of a
specific user profile. While the state of a private flag associated
with a particular user-term pairing may be determined
exclusively by the confidence level associated with the
pairing, in an alternative embodiment of the invention, the
state of this flag may be set by other mechanisms. For
example, as described in further detail below with reference
to FIG. 17E, a user may be provided with the opportunity
"private" and "public" portions. Specifically, unrestricted manually to modify the private or public designation of a 
access to the "public" portion of the user knowledge profile term (i.e., move a term between the public and private 
may be provided to other users, while restricted access to the portions of a user knowledge profile). A user may be 

"private" portion may be facihtated. For example, unre- 65 provided with an opportunity to modify the private or public 
stricted access may encompass allowing a user to review designation of a term in response to a number of events, 
details concerning a user knowledge profile, and the target Merely for example, a tiser may be prompted to designate a 
term as public in response to a "hit" upon a term in the
private portion during a query process, such as during an 
"expert-lookup" query or during an "addressee-lookup" 
query. When storing the term in the user knowledge profile 
at either steps 260 or 262, the allocation of the term to the 
appropriate portion may be made by setting a flag, associ-
ated with the term, in the "private flag" column 119 within
the user-term table 112, as illustrated in FIG. 16B. For 
example, a logical "1" entry within the "private flag" column 
119 may identify the associated term as being in the "pri- 
vate" portion of the relevant user knowledge profile, while 
a logical "0" entry within the "private flag" column 119 may
identify the associated term as being in the "public" portion 
of the relevant user knowledge profile. 
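
As a rough illustration only, the allocation of a term at steps 260 and 262 and the
use of the "private flag" column 119 might be sketched as follows in Python. The
names UserTermEntry, store_term and set_designation are hypothetical and do not
appear in the specification, and the threshold of 300 is simply borrowed from the
FIG. 16A example. The text does not say how a confidence level value equal to the
threshold is handled; this sketch assumes such a term is treated as public.

```python
from dataclasses import dataclass

THRESHOLD = 300  # assumed value, taken from the FIG. 16A example


@dataclass
class UserTermEntry:
    """One user-term pairing, loosely modeled on a row of the user-term table 112."""
    user: str
    term: str
    confidence_level: int
    private_flag: int = 0  # logical 1 = private portion, logical 0 = public portion


def store_term(user, term, confidence_level, threshold=THRESHOLD):
    """Allocate a term to the private or public portion (steps 260 and 262)."""
    # Below the threshold: private portion. Otherwise (assumption for equality): public.
    private_flag = 1 if confidence_level < threshold else 0
    return UserTermEntry(user, term, confidence_level, private_flag)


def set_designation(entry, private):
    """Manually move a term between portions, overriding the confidence-based flag."""
    entry.private_flag = 1 if private else 0
    return entry
```

In this sketch the flag is first derived from the confidence level value and may
afterwards be overridden by the user, which mirrors the two ways of setting the
flag described above.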

FIG. 15B illustrates an exemplary method 260/262,
according to one embodiment of the present invention, of
storing a term in either a public or private portion of a user 
knowledge profile. Specifically, a respective term is added to 
a notification list at step 1264, following the determination 
made at decision box 258, as illustrated in FIG. 15A. At
decision box 1268, a determination is made as to whether a
predetermined number of terms have been accumulated
within the notification list, or whether a predetermined time
period has passed. If these conditions are not met, the
method waits for additional terms to be added to the noti-
fication list, or for further time to pass, at step 1266, before
looping back to the step 1264. On the other hand, should a
condition within the decision box 1268 have been met, the
method proceeds to step 1270, where the notification list,
that includes a predetermined number of terms that are to be
added to the user knowledge profile, is displayed to a user.
The notification list may be provided to the user in the form
of an e-mail message, or alternatively the user may be 
directed to a web site (e.g., by a URL included within e-mail 
message) that displays the notification list. In yet a further 
embodiment, the notification list may be displayed on a web 
or intranet page that is frequently accessed by the user, such 
as a home page. At step 1272, the user then selects terms that 
are to be included in the public portion of the user knowl-
edge profile. For example, the user may select appropriate
buttons displayed alongside the various terms within the
notification list to identify terms for either the public or
private portions of the user knowledge profile. At step 1274, 
private flags, such as those contained within the "private 
flag" column 119 of the user-term table 112 as shown in FIG.
16B, may be set to a logical zero "0" to indicate that the
terms selected by the user are included within the public
portion. Similarly, private flags may be set to a logical one
"1" to indicate that terms that were not selected by the user for
inclusion within the public portion are by default included
within the private portion. It will of course be appreciated 
that the user may, at step 1272, select terms to be included 
within the private portion, in which case un-selected terms
will by default be included within the public portion. The
method then ends at step 1280. 
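
The batching behaviour of FIG. 15B can be sketched loosely as below. This is a
minimal Python sketch under stated assumptions, not the patented implementation:
the class name NotificationList, its parameters, and the injectable clock are all
hypothetical, and the default limits (five terms, seven days) are arbitrary
placeholders since the specification only speaks of a "predetermined" number and
time period.

```python
import time


class NotificationList:
    """Accumulates candidate terms until a count or time condition is met."""

    def __init__(self, max_terms=5, max_age_seconds=7 * 24 * 3600, clock=time.time):
        self.max_terms = max_terms              # assumed "predetermined number of terms"
        self.max_age_seconds = max_age_seconds  # assumed "predetermined time period"
        self.clock = clock                      # injectable so the sketch can be tested
        self.terms = []
        self.started = clock()

    def add(self, term):
        """Step 1264: add a term to the notification list."""
        self.terms.append(term)

    def ready(self):
        """Decision box 1268: enough terms accumulated, or enough time passed."""
        enough_terms = len(self.terms) >= self.max_terms
        enough_time = self.clock() - self.started >= self.max_age_seconds
        return enough_terms or enough_time

    def apply_selection(self, selected_public):
        """Steps 1272-1274: selected terms get flag 0 (public), the rest flag 1 (private)."""
        flags = {t: (0 if t in selected_public else 1) for t in self.terms}
        self.terms = []
        self.started = self.clock()
        return flags
```

A caller would poll ready(), display the list to the user when it returns True, and
pass the user's chosen terms to apply_selection() to obtain the private flag values.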

The above described method is advantageous in that a 
user is not required to remember routinely to update his or
her user profile, but is instead periodically notified of terms 
that are candidates for inclusion within his or her user 
knowledge profile. Upon notification, the user may then 
select terms for inclusion within the respective public and 
private portions of the user knowledge profile. As such, the 
method may be viewed as a "push" model for profile
maintenance. 

METHOD OF ACCESSING A USER
KNOWLEDGE PROFILE

While the above method 250 is described as being
executed at the time of construction of a user knowledge
profile, it will readily be appreciated that the method may be
dynamically implemented as required and in response to a
specific query, with a view to determining whether at least
a portion of a user knowledge profile should be published,
or remain private responsive to the relevant query. To this
end, FIG. 17A shows a flow chart illustrating a method 300,
according to one exemplary embodiment of the present
invention, of facilitating access to a user knowledge profile.
The method 300 commences at step 302, and then proceeds
to step 304, where a threshold value is determined. At step
306, a document term within an electronic document gen-
erated by a user (hereinafter referred to as a "query" user) is
identified. Step 306 is performed by the term extractor 46
responsive, for example, to the receipt of an e-mail from the
mail system interface 42 within the knowledge gathering
system 28. At step 308, comparison logic 51 within the term
extractor 46 identifies a knowledge term within the reposi-
tory 50 corresponding to the document term identified at
step 306. The comparison logic 51 also determines a con-
fidence level value for the identified knowledge term. At
decision box 310, the comparison logic 51 makes a deter-
mination as to whether the confidence level value for the
knowledge term identified at step 308 is less than the
threshold value identified at step 304. If not (that is, the
confidence level value is greater than the threshold value)
then a public profile process is executed at step 312.
Alternatively, a private profile process is executed at step
314 if the confidence level value falls below the threshold
value. The method 300 then terminates at step 316.

FIG. 17B shows a flowchart illustrating an alternative
method 301, according to an exemplary embodiment of the
present invention, of facilitating access to a user knowledge
profile. The method 301 commences at step 302, and then
proceeds to step 306, where a document term within an
electronic document generated by a user (i.e., the "query"
user) is identified. The term extractor 46 performs step 306
responsive, for example, to the receipt of an e-mail message
from the mail system interface 42 within the knowledge
gathering system 28. At step 308, the comparison logic 51
within the term extractor 46 identifies a knowledge term
within the knowledge repository 50 corresponding to the
document term identified at step 306. At decision box 311,
the comparison logic 51 then makes a determination as to
whether a "private" flag for the knowledge term is set to
indicate the relevant knowledge term as being either in the
public or the private portion of a user knowledge profile.
Specifically, the comparison logic 51 may examine the
content of an entry in the private flag column 119 of a
user-term table for a specific user-term pairing of which the
knowledge term is a component. If the "private" flag for the 
knowledge term is set, thus indicating the knowledge term 
as being in the private portion of a user knowledge profile, 
the private profile process is executed at step 314. 
Alternatively, the public profile process is executed at step 
312. The method 301 then terminates at step 316. 
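
The two access methods differ only in how a hit is routed to the public or private
profile process. A minimal Python sketch of that routing decision follows; the
function names and the string return values are hypothetical and chosen purely for
illustration. The handling of a confidence level value exactly equal to the
threshold is not stated in the text and is assumed here to be public.

```python
def route_hit_by_threshold(confidence_level, threshold):
    """Method 300 (FIG. 17A): route on confidence level versus threshold.

    Returns "private" (step 314) when the confidence level value is below
    the threshold, otherwise "public" (step 312).
    """
    return "private" if confidence_level < threshold else "public"


def route_hit_by_flag(private_flag):
    """Method 301 (FIG. 17B): route on the stored private flag.

    A set flag (logical 1) selects the private profile process (step 314);
    otherwise the public profile process (step 312) is selected.
    """
    return "private" if private_flag == 1 else "public"
```

The flag-based variant avoids recomputing the comparison at query time and allows
a manual override by the user to take effect, at the cost of storing one extra
column per user-term pairing.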

FIG. 17C shows a flow chart detailing a method 312,
according to an exemplary embodiment of the present
invention, of performing the public profile process men-
tioned in FIGS. 17A and 17B. The method 312 commences
at step 320, and user information, the knowledge term
corresponding to the document term, and the confidence
level value assigned to the relevant knowledge term are
retrieved at steps 322, 324, and 326. This information is then
displayed to the query user at step 328, whereafter the
method 312 terminates at step 330.

FIG. 17D shows a flow chart detailing a method 314,
according to an exemplary embodiment of the present
invention, of performing the private profile process men-
tioned in FIGS. 17A and 17B. The method 314 commences
at step 340, and proceeds to step 342, where a user (herein-
after referred to as the "target" user) who is the owner of the
knowledge profile against which the hit occurred is notified
of the query hit. This notification may occur in any one of
a number of ways, such as for example via an e-mail
message. Such an e-mail message may further include a
URL pointing to a network location at which further infor-
mation regarding the query hit, as well as a number of target
user options, may be presented. At step 346, the reasons for
the query hit are displayed to the target user. Such reasons
may include, for example, matching, or similar, document
and knowledge terms utilizing which the hit was identified
and the confidence level value associated with the knowl-
edge term. These reasons may furthermore be presented
within the e-mail propagated at step 342, or at the network
location identified by the URL embedded within the e-mail.
At step 348, the target user then exercises a number of target
user options. For example, the target user may elect to reject
the hit, accept the hit, and/or modify his or her user knowl-
edge profile in light of the hit. Specifically, the target user
may wish to "move" certain terms between the public and
private portions of the user knowledge profile. Further, the
user may optionally delete certain terms from the user
knowledge profile in order to avoid any further occurrences
of hits on such terms. These target user options may fur-
thermore be exercised via a HTML document at the network
location identified by the URL. At decision box 350, a
determination is made as to whether the user elected to
modify the user knowledge profile. If so, a profile modifi-
cation process, which is described below with reference to
FIG. 17E, is executed at step 352. Otherwise, a determina-
tion is made at decision box 354 as to whether the target user
rejected the hit. If so, the hit is de-registered at step 356.
Alternatively, if the target user accepted the hit, the public
profile process described above with reference to FIG. 17C
is executed at step 358. The method 314 then terminates at
step 360.

FIG. 17E is a flowchart illustrating a method 352, accord-
ing to an exemplary embodiment of the present invention,
for implementing the profile modification process illustrated
at step 352 in FIG. 17D. The method 352 commences at step
362, and then proceeds to display step 364, where the target
user is prompted to (1) move a term, on which a "hit" has
occurred, between the private and public portions of his or
her user knowledge profile, or to (2) delete the relevant term
from his or her user knowledge profile. Specifically, the
target user may be presented with a user dialog, a HTML-
enriched e-mail message, or a Web page, listing the various
terms upon which hits occurred as a result of an inquiry,
besides which appropriate buttons are displayed that allow
the user to designate the term either to be included in the
public or private portion of his or her user knowledge
profile, or that allow the user to mark the relevant term for
deletion from the user knowledge profile. At input step 366,
the target user makes selections regarding the terms in the
manner described above. At decision box 368, a determina-
tion is made as to whether the user selected terms for transfer
between the public and private portions of the user profile,
or for inclusion within the user profile. If so, the method 352
proceeds to step 370, wherein the appropriate terms are
designated as being either public or private, in accordance
with the user selection, by setting appropriate values in the
"private flag" column 119 within the user-term table, as
illustrated in FIG. 16B. Thereafter, the method proceeds to
decision box 372, wherein a determination is made as to
whether the user has elected to delete any of the terms
presented at step 364. If so, the relevant terms are deleted
from the user knowledge profile at step 374. The method
then terminates at step 378.

The methodologies described above with reference to
FIGS. 15 through 17E are advantageous in that, where the
confidence level of a term falls below a predetermined
threshold, the owner of the user knowledge profile may elect
to be involved in the process of determining whether a query
hit is accurate or inaccurate. The owner of the user knowl-
edge profile is also afforded the opportunity to update and
modify his or her knowledge profile as and when needed.
Further, the owner of the user knowledge profile is only
engaged in the process for hits below a predetermined
certainty level and on a public portion of the knowledge
profile. Matches between document terms and knowledge
terms in the public portion are automatically processed,
without any manual involvement.

METHOD FOR ADDRESSING AN
ELECTRONIC DOCUMENT FOR
TRANSMISSION OVER A NETWORK

Returning now briefly to FIG. 5, the addressing system 84
within the e-mail client extensions 19 operates indepen-
dently [illegible]
interface 80 within the e-mail client extensions 19 may
[illegible] determines such sugges-
tion is possible, based on the length of a draft message being
[illegible] command button labeled "Suggest
Recipients". This button is user selectable to initiate a
sequence of operations whereby the author of the e-mail is
presented with a list of potential recipients who may be
[illegible] based on predetermined
[illegible], or a commonality with a confirmed
addressee.

FIG. 18A is a flow chart illustrating a method 400,
according to an exemplary embodiment of the present
invention, of addressing an electronic document, such as an
e-mail, for transmission over a network, such as the Internet
or an Intranet. The method 400 commences at step 402, and
proceeds to step 401, where a determination is made as to
whether the body of the draft message exceeds a prede-
termined length (or number of words). If so, content of the
electronic document (e.g., an e-mail message body) is trans-
mitted to the knowledge access server 26 via the web server
20 at step 404. Specifically, a socket connection is open
between the e-mail client 18 and the web server 20, and the
message body, which may still be in draft
form, is transmitted using the Hypertext Transfer Protocol
(HTTP) via the web server 20 to the knowledge access
server 26. At step 406, the knowledge access server 26
processes the message body, as will be described in further
detail below. At step 408, the knowledge access server 26
transmits a potential or proposed recipient list and associated
information to the addressing system 84 of the e-mail client.
Specifically, the information transmitted to the e-mail
client may include the following:

1. A list of user names, as listed within column 94 of the
user table 90, as well as corresponding e-mail
addresses, as listed within the column 98 of the user
table 90;

2. A list of term identifiers, as listed in column 116 of the
user-term table 112, that were located within the "pub-
lic" portion of a user knowledge profile that formed the
basis for a match between document terms within the
message body and knowledge terms within the user
knowledge profile; and

3. A "matching metric" for each user included in the list
of user names (1). Each "matching metric" comprises
the sum of the confidence level values, each multiplied
by the weighted occurrences of the term within the
message body, for the terms identified by the list of
term identifiers (2) and associated with the relevant
user. This "matching metric" is indicative of the
strength of the recommendation by the knowledge
access server 26 that the relevant user (i.e., potential
recipient) be included within the list of confirmed
addressees.

At step 410, the author of the electronic document is
presented with a list of potential recipients by the e-mail
client 18, and specifically by the addressing system 84 via a
user dialog 440 as shown in FIG. 18D. FIG. 18D groups
matching levels into matching classes each characterized by
a visual representation (icon).

The user dialog 440 shown in FIG. 18D presents the list
of potential recipients in a "potential recipients" scrolling
window 442, wherein the names of potential recipients are
grouped into levels or ranked classes according to the
strength of the matching metric. An icon is also associated
with each user name, and provides an indication of the
strength of the recommendation of the relevant potential
recipients. Merely for example, a fully shaded circle may
indicate a high recommendation, with various degrees of
"blackening" or darkening of a circle indicating lesser
degrees of recommendation. A "rejection" icon may be
associated with an actual recipient, and an example of such
a "rejection" icon is indicated at 441. The "rejection" icon
indicates a negative recommendation on an actual recipient
supplied by the author of the message, and may be provided
in response to a user manually modifying his or her profile
to designate certain terms therein as generating such a
"rejection" status for a recipient against which a hit occurs.

The user dialog 440 also presents a list of actual (or
confirmed) recipients in three windows, namely a "to:"
window 442, a "cc:" window 444 and a "bcc:" window 446.
An inquiring user may move recipients between the poten-
tial recipients list and the actual recipients lists utilizing the
"Add" and "Remove" buttons indicated at 450. The user
dialog 440 also includes an array of "select" buttons 452,
utilizing which a user can determine the recommendation
group to be displayed within the scrolling window 442. The
user dialog 440 finally also includes "Explain Match" and
"More" buttons 454 and 456, the purposes of which are
elaborated upon below. As shown in FIG. 18D, the author
user may select an "Explain" function for any of the pro-
posed recipients utilizing the "Explain Match" button 454. If
it is determined at decision box 412 that this "Explain"
function has been selected, the method 400 branches to step
414, as illustrated in FIG. 18B. Specifically, at step 414, the
addressing system 84 propagates a further "Explain" query
to the knowledge access server 26 utilizing HTTP, and opens
a browser window within which to display the results of the
query. At step 416, the knowledge access server 26 retrieves
the terms (i.e., the knowledge terms) that constituted the
basis for the match, as well as associated confidence level
values. This information is retrieved from the public portion
of the relevant user knowledge profile in the knowledge
repository 50. At step 418, the information retrieved at step
416 is propagated to the client 18 from the knowledge access
server 26 via the web server 20. The information is then
displayed within the browser window opened by the e-mail
client 18 at step 414. Accordingly, the author user is thus
able to ascertain the reason for the proposal of a potential
recipient by the addressing system 84, and to make a more
informed decision as to whether the proposed recipient
should be included within the actual recipients (confirmed
addressee) list.

The user also has the option of initiating a "More"
function by selecting the "More" button 456 on the user
dialog 440, this function serving to provide the user with
additional proposed recipients. Accordingly, a determination
is made at step 422 as to whether the "More" function has
been selected by the author user. If so, the method 400
branches to step 424 as shown in FIG. 18C, where the client
18 propagates a "More" request to the knowledge access
server 26 in the same manner as the "Explain" query was
propagated to the knowledge access server at step 414. At
step 426, the knowledge access server 26 identifies further
potential recipients, for example, by using a threshold value
for the "matching metric" that is lower than a threshold
value utilized as a cutoff during the initial information
retrieval operation performed at steps 406 and 408. At step
428, the knowledge access server 26 then transmits the list
of further potential recipients, and associated information, to
the e-mail client 18. At step 430, the list of additional
potential recipients is presented to the author user for
selection in descending order according to the "matching
metric" associated with each of the potential recipients.

At step 432, the user then adds at his or her option, or
deletes selected potential or "rejected" recipients to the list
of actual recipients identified in "to:", "cc:" or "bcc:" lists of
the e-mail, thus altering the status of the potential recipients
to actual recipients. At step 434, the e-mail message is then
transmitted to the confirmed addressees.

If the user profile includes a "rejection" status on a term
(something a user can do through manual modification of the
profile), then a special symbol, such as that indicated at 441 in
FIG. 18D, may be returned indicating a negative recom-
mendation on a recipient supplied by the author of the
message.

The exemplary method 400 discussed above is advanta-
geous in that the knowledge access server 26 automatically
provides the author user with a list of potential addressees,
based on a matching between document terms identified
within the message body of an e-mail and knowledge terms
included within user profiles.

CASE CONTROL

FIG. 19 is a flow chart illustrating a method 500, accord-
ing to one exemplary embodiment of the present invention,
of managing user authorization to publish, or permit access
to, a user knowledge profile. The method 500 is executed by
the case controller 45A that tracks open "cases" and initiates
notification to users concerning the status of such cases. For
the purposes of the present specification, the term "case"
may be taken to refer to a user authorization process for
publication of, or access to, a user knowledge profile. The
method 500 commences at step 502, and then proceeds to
step 504, where a match is detected with a private portion of
a user knowledge profile. At step 504, the case controller
45A then opens a case, and notifies the target user at step 506
concerning the "hits" or matches between a document (or
query) term and a knowledge term in a user knowledge
profile. This notification may be by way of an e-mail
message, or by way of publication of information on a Web
page accessed by the user. At step 508, the case controller
45A determines whether an expiration date, by which the
target user is required to respond to the hit, has been reached
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or in fact passed. If the expiration date has passed, the case value, or alternatively greater than a predetermined site- 
controller 45A closes the case and the method 500 tcrmi- wide or system-wide threshold value. If the confidence level 
nates. Altematively, a determination is made at decision box value is determined to be greater than the confidence 
510 as to whether the target user has responded to the memory value (or the threshold value), the confidence 
notificationby authorizing publication of, or access to, his or 5 memory value is then made equal to the confidence level 
her user knowledge profile based on the hit on the private value by overwriting the previous confidence memory value 
portion thereof. If the target user has not authorized such ^ith the newly calculated confidence level value. In this 
action (i.e., declined authorization), an inquiring user (e.g., ^^y^ '\ ^ ^"^^"^^ confidence level value does not 
the author user of an c-maU or a user performing a manual ^^^^^^ the confidence memory value, 
database search to locate an expert) is notified of the decline lo ^IG. 22 is an exemplary user-term table U2, according to 
at step 512. Alternatively, should the target user have autho- embodiment of the present inventioi^ that is shown to 
• J »A- ^- *u ■ • • •••11 include a confidence level column 118, a confidence 
rized publication or access, the inquiring user is similarly , , j.- * i ^-^i 
^■o J J .•/-.• r memory value column 121, and a time stamp column 123. 
notified of the authonzatioD at step 514. The notmcation of ^^^j^ ^^^^^^ ^ confidence level value and a 
the inquirmg user at steps 512 or 514 may be performed by confidence memory value for each user-term pairing within 
transmitting an e-mail to the mqmnng user, or by providing 15 ^^^^^ ^ ^^is table that the confidence level 
a suitable mdicaUon on a web page (e.g., a home page or values and the confidence memory values are written by the 
search/query web page) accessed by the inquiring user. At method 550. The time stamp column 123 records a date and 
step 516, the appropriate portions of the user profile per- time stamp value indicative of the date and time at which the 
taining to the target user are published to the inquiring user, corresponding confidence memory value was last updated, 
or the inquiring user is otherwise permitted access to the user 20 This value will accordingly be updated upon the overwriting 
profile. At step 518, the case controller 45 A then closes the of the confidence memory value at step 560. 
case, whereafter the method terminates. Should the confidence level value not exceed the confi- 

CTTDDT truvrcKFTAT xrfrTTTjr^r* rtc TPiii'xrrTTTVfXTr- dence memory value or the threshold value, as determined 

SUPPLEMENTAL METHOD OF IDENTIFTING ^ ^ • • l eeo ^t. *l j * en *u j ^ 

CONFIDENCE VALUE decision box 558, the method 550 then proceeds to 

25 decision box 562, where a further determination is made as 

FIGS. 7-9 describe an exemplary method 140 of identi- to whether another time or document window, associated 

fying knowledge terms and calculating associated confi- with a step of decaying the confidence memory value, has 

dence level values. A supplemental method 550, according expired. If not, the confidence memory value is left 

to an exemplary embodiment of the present invention, of unchanged at step 5 6 4. Alternatively, if the time or document 

assigning a confidence value to a term will now be described 30 window associated with the decay step has expired, the 

with reference to FIGS. 20-22. The supplemental method confidence memory value is decayed by a predetermined 

550 seeks to compensate for a low confidence level value value or percentage at step 566. For example, the confidence 

which may be associated with the term as a result of the term memory value may be decayed by five (5) percent per 

not appearing in any recent documents associated with a
user. It will be appreciated that by calculating a confidence
level value utilizing the method illustrated in FIG. 9, aged
terms (i.e., terms which have not appeared in recent
documents) may be attributed a low confidence level value
even though they may be highly descriptive of a specialization
or knowledge of a user. The situation may occur where
a user is particularly active with respect to a particular topic
for a short period of time, and then re-focuses attention on
another topic. Over time, the methodology illustrated in
FIG. 9 may too rapidly lower the confidence level values
associated with terms indicating user knowledge.

Referring to FIG. 20, there is illustrated the exemplary
method 550 of assigning a confidence value to a term. The
method 550 commences at step 552, whereafter an initial
confidence memory value (as distinct from a confidence
level value) is assigned a zero (0) value. At step 556, a
confidence level value for a term is calculated utilizing, for
example, the method 154 illustrated in FIG. 9. However, this
confidence level value is only calculated for occurrences of
the relevant term within a particular time or document
window. For example, in summing the adjusted count values
at step 190 within the method 154, the adjusted count values
for only documents received within a predetermined time
(e.g., the past 30 days), or only for a predetermined number
of documents (e.g., the last 30 documents) are utilized to
calculate the summed adjusted count value. It will be
appreciated that by discarding documents, which occurred
before the time or document window, the effect on the
confidence level values for aged terms by the absence of
such aged terms within recent documents may be reduced.

At decision box 558, a determination is then made as to
whether a newly calculated confidence level value for a term
is greater than a previously recorded confidence memory

month. The time stamp value may be utilized to determine
the window associated with the decay step. The time stamp
value associated with the decayed confidence memory value
is also updated at step 566. The method 550 then terminates
at step 568.

FIG. 21 is a flowchart illustrating an exemplary method
570, according to one embodiment of the present invention,
of determining or identifying a confidence value (e.g., either
a confidence level value or a confidence memory value) for
a term. The method 570 may be executed in performance of
any of the steps described in the preceding flow charts that
require the identification of a confidence level value for a
term in response to a hit on the term by a document term
(e.g., in an electronic document or other query). The method
570 commences at step 572, and proceeds to step 574, where
a confidence level value for a term within a user profile is
identified. For example, the confidence level value may be
identified within the user-term table 112 illustrated in FIG.
22. At step 576, a confidence memory value for the term may
then also be identified, again by referencing the user-term
table 112 illustrated in FIG. 22. At decision box 578, a
determination is then made as to whether the confidence
level value is greater than the confidence memory value. If
the confidence level value is greater than the confidence
memory value, the confidence level value is returned, at step
580, as the confidence value. Alternatively, should the
confidence memory value be greater than the confidence
level value, the confidence memory value is returned, at step
582, as the confidence value. The method 570 then terminates
at step 584.

Accordingly, by controlling the rate at which a confidence
value for a term is lowered or decayed, the present invention
seeks to prevent having a potentially relevant term ignored
or overlooked.
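By way of illustration only, the windowed calculation of method 550 and the selection performed by method 570 might be sketched in Python as follows. The names `TermRecord`, `windowed_confidence_level`, `update_record` and `confidence_value` are hypothetical and do not appear in the specification. The sketch makes two assumptions that should be read as such: the summed adjusted count is used directly as the confidence level value (method 154 of FIG. 9 may apply further steps), and the confidence memory value is simply raised to a newly calculated confidence level value that exceeds it. The decay step is omitted because its rule is not stated in this passage.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional, Tuple


@dataclass
class TermRecord:
    """One row of a hypothetical user-term table."""
    confidence_level: float = 0.0    # recalculated from recent documents only
    confidence_memory: float = 0.0   # initially zero, as at the start of method 550
    memory_timestamp: Optional[datetime] = None


def windowed_confidence_level(
    adjusted_counts: Iterable[Tuple[datetime, float]],
    now: datetime,
    window: timedelta = timedelta(days=30),
) -> float:
    """Sum adjusted counts for documents received inside the time window.

    Assumption: the summed adjusted count stands in for the confidence level.
    """
    cutoff = now - window
    return sum(count for received_at, count in adjusted_counts if received_at >= cutoff)


def update_record(record: TermRecord, adjusted_counts, now: datetime) -> TermRecord:
    """Recalculate the level; raise the memory value if the new level exceeds it.

    Assumption: the memory value is overwritten with the higher level.
    No decay is applied here.
    """
    record.confidence_level = windowed_confidence_level(adjusted_counts, now)
    if record.confidence_level > record.confidence_memory:
        record.confidence_memory = record.confidence_level
        record.memory_timestamp = now
    return record


def confidence_value(record: TermRecord) -> float:
    """Method 570: return whichever of the two stored values is greater."""
    return max(record.confidence_level, record.confidence_memory)
```

Under these assumptions, a term that was prominent several months ago retains its confidence memory value even after its windowed confidence level value has fallen to zero, so a query that hits the term still receives the higher of the two values.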
COMPUTER SYSTEM

FIG. 23 is a diagrammatic representation of a machine in
the form of computer system 600 within which software, in
the form of a series of machine-readable instructions, for
performing any one of the methods discussed above may be
executed. The computer system 600 includes a processor
602, a main memory 603 and a static memory 604, which
communicate via a bus 606. The computer system 600 is
further shown to include a video display unit 608 (e.g., a
liquid crystal display (LCD) or a cathode ray tube (CRT)).
The computer system 600 also includes an alphanumeric
input device 610 (e.g., a keyboard), a cursor control device
612 (e.g., a mouse), a disk drive unit 614, a signal generation
device 616 (e.g., a speaker) and a network interface device
618. The disk drive unit 614 accommodates a machine-readable
medium 615 on which software 620 embodying
any one of the methods described above is stored. The
software 620 is shown to also reside, completely or at least
partially, within the main memory 603 and/or within the
processor 602. The software 620 may furthermore be transmitted
or received by the network interface device 618. For
the purposes of the present specification, the term "machine-readable
medium" shall be taken to include any medium that
is capable of storing or encoding a sequence of instructions
for execution by a machine, such as the computer system
600, and that causes the machine to perform the methods
of the present invention. The term "machine-readable
medium" shall be taken to include, but not be limited to,
solid-state memories, optical and magnetic disks, and carrier
wave signals.

Thus, a method and apparatus for constructing and maintaining
a user knowledge profile have been described.
Although the present invention has been described with
reference to specific exemplary embodiments, it will be
evident that various modifications and changes may be made
to these embodiments without departing from the broader
spirit and scope of the invention. Accordingly, the specification
and drawings are to be regarded in an illustrative
rather than a restrictive sense.

What is claimed is:

1. A computer-implemented method of constructing a user
knowledge profile, the method including:

automatically assigning a confidence level to content
within an electronic document associated with a first
user, the content being potentially indicative of a user
knowledge base of the first user; and

storing the content in either a first or a second portion of
a user knowledge profile of the first user according to
the assigned confidence level,

wherein the first and the second portions of the user
knowledge profile of the first user have different access
restrictions with respect to a second user.

2. The method of claim 1 wherein access to the first
portion of the user knowledge profile by the second user is
unrestricted and the storing comprises storing the content in
the first portion if the confidence level exceeds a predetermined
threshold.

3. The method of claim 2 wherein access to the second
portion of the user knowledge profile by the second user is
restricted and the storing comprises storing the content in the
second portion if the confidence level is below the predetermined
threshold.

4. The method of claim 1 including designating the first
portion of the user knowledge profile of the first user as a
public portion and designating the second portion of the user
knowledge profile of the first user as a private portion.

5. The method of claim 1 including determining whether
the electronic document is indicated by the first user as being
for use in the construction of the user knowledge profile.

6. The method of claim 5 wherein the determining comprises
examining a flag associated with the electronic
document, the flag marking the electronic document as
being for the construction of the user knowledge
profile of the first user.

7. The method of claim 5 wherein the determining comprises
determining whether the electronic document is
addressed to an entity that performs the construction of the
user knowledge profile of the first user.

8. The method of claim 1 including determining whether
the electronic document indicates an attachment as being for
use in the construction of the user knowledge profile of the
user.

9. The method of claim 1 wherein automatically assigning
the confidence level comprises identifying a term within the
content.

10. The method of claim 9 wherein automatically assigning
the confidence level includes counting a number of
words within the term.

11. The method of claim 9 wherein automatically assigning
the confidence level includes identifying parts of speech
comprising the term.

12. The method of claim 9 wherein automatically assigning
the confidence level includes comparing the term to a
collection of lexicon terms.

13. The method of claim 9 wherein automatically assigning
the confidence level includes determining a frequency
with which the term occurs within the electronic document
associated with the first user.

14. The method of claim 13 wherein the determining of
the frequency comprises determining a percentage value
indicating a frequency of occurrence of the term relative to
a total number of terms within the electronic document
associated with the first user.

15. The method of claim 9 wherein automatically assigning
the confidence level includes determining a frequency
with which the term occurs within a plurality of electronic
documents associated with the first user.

16. The method of claim 9 wherein the term comprises
any one of a group consisting of a grammar term, a noun
phrase and a word.

17. The method of claim 1 wherein automatically assigning
the confidence level includes analyzing addressee information
associated with the electronic document.

18. The method of claim 17 wherein the analyzing of the
addressee information includes identification of a level of
seniority of an addressee of the electronic document.

19. The method of claim 17 wherein the analyzing of the
addressee information includes determining an average level
of seniority of all addressees of the electronic document.

20. The method of claim 17 wherein the analyzing of the
addressee information includes determining a level of
seniority of a most senior addressee of the electronic document.

21. The method of claim 18 including generating a
document weight term utilizing the level of seniority of the
addressee of the electronic document.

22. The method of claim 21 including identifying a
number of addressees of the electronic document.

23. The method of claim 22 including generating a
document weight term utilizing the number of addressees of
the electronic document.

24. The method of claim 17 wherein the analyzing of the
addressee information includes identification of an organizational
group of at least one addressee of the electronic
document.
25. The method of claim 23 including generating a
document weight term utilizing the organizational group of
the at least one addressee of the electronic document.

26. The method of claim 1 wherein the storing of the 
content includes notifying the first user of the content, prior
to storing the content in either the first or second portion of 
the user knowledge profile of the first user. 

27. The method of claim 26 wherein the notifying 
includes providing the first user with an option of storing the 
content in either the first or second portion of the user 
knowledge profile of the first user. 

28. The method of claim 26 wherein notifying the first 
user includes determining whether the confidence level 
automatically assigned to the content exceeds a predeter- 
mined threshold, and notifying the first user of the content 
when the confidence level exceeds the predetermined thresh- 
old. 

29. The method of claim 1 wherein the electronic docu- 
ment comprises an electronic mail message generated by the 
first user.

30. The method of claim 1 wherein the electronic docu- 
ment comprises an electronic document attached to an 
electronic mail message transmitted by the first user. 

31. The method of claim 1 wherein the electronic docu- 
ment comprises a database inquiry generated by the first
user. 

32. The method of claim 1 wherein the electronic docu- 
ment comprises any one of a group consisting of a user 
profile document and a list of bookmarks, folders and 
directories generated by the first user.

33. A data processing system for constructing a user 
knowledge profile, the system comprising: 

confidence logic to examine an electronic document, 
associated with a first user, and to assign a confidence 
level to content within the electronic document, the
content being potentially indicative of a user knowl- 
edge base of the first user; and 

a profiler to store the content in either the first or second 
portion of the user knowledge profile of the first user 
according to the assigned confidence level,

wherein the first and the second portions of the user 
knowledge profile of the first user have different access 
restrictions with respect to a second user.

34. The system of claim 33 wherein access to the first 
portion of the user knowledge profile by the second user is
unrestricted, and the profiler stores the content in the first 
portion if the confidence level exceeds a predetermined 
threshold. 

35. The system of claim 34 wherein access to the second 
portion of the user knowledge profile by the second user is
restricted, and the profiler stores the content in the second 
portion if the confidence level is below the predetermined 
threshold. 

36. The system of claim 33 including a client to propagate 
the electronic document to the confidence logic.

37. The system of claim 33 wherein the client is an e-mail
client program. 

38. The system of claim 33 wherein the confidence logic 
determines whether the electronic document is indicated by 
the first user as being for use in the construction of the user
knowledge profile. 

39. The system of claim 38 wherein the confidence logic 
examines a flag associated with the electronic document, the 
flag marking the electronic document as being for use in the 
construction of the user knowledge profile of the first user.

40. The system of claim 38 wherein the confidence logic 
determines whether the electronic document is addressed to 
an entity that performs the construction of the user
knowledge profile of the first user.

41. The system of claim 33 wherein the confidence logic 
determines whether the electronic document indicates an 
attachment as being for use in the construction of the user 
knowledge profile of the first user. 

42. The system of claim 33 wherein the confidence logic 
identifies a term within the content. 

43. The system of claim 42 wherein the confidence logic 
counts a number of words within the term. 

44. The system of claim 42 wherein the confidence logic 
identifies parts of speech comprising the term. 

45. The system of claim 42 wherein the confidence logic 
compares the term to a collection of lexicon terms.

46. The system of claim 42 wherein the confidence logic 
determines a frequency with which the term occurs within 
the electronic document associated with the first user. 

47. The system of claim 46 wherein the confidence logic 
determines a percentage value indicating a frequency of 
occurrence of the term relative to a total number of terms 
within the electronic document associated with the first user. 

48. The system of claim 42 wherein the confidence logic 
determines a frequency with which the term occurs within a
plurality of electronic documents associated with the first
user. 

49. The system of claim 42 wherein the term comprises 
any one of a group consisting of a grammar term, a noun 
phrase and a word. 

50. The system of claim 33 wherein the confidence logic 
analyzes addressee information associated with the elec- 
tronic document. 

51. The system of claim 50 wherein the confidence logic 
identifies a level of seniority of an addressee of the elec- 
tronic document. 

52. The system of claim 50 wherein the confidence logic 
determines an average level of seniority of all addressees of 
the electronic document. 

53. The system of claim 50 wherein the confidence logic 
determines a level of seniority of a most senior addressee of 
the electronic document. 

54. The system of claim 50 wherein the confidence logic 
generates a document weight term utilizing the level of 
seniority of the addressee of the electronic document. 

55. The system of claim 33 wherein the confidence logic 
identifies the number of addressees of the electronic docu- 
ment. 

56. The system of claim 55 wherein the confidence logic 
generates a document weight term utilizing the number of 
addressees of the electronic document. 

57. The system of claim 33 wherein the confidence logic 
identifies an organizational group of at least one addressee 
of the electronic document. 

58. The system of claim 57 wherein the confidence logic 
generates a document weight term utilizing the organiza- 
tional group of the at least one addressee of the electronic 
document. 

59. The system of claim 33 including a notifier that 
notifies the first user of the content, prior to the content being 
stored in either the first or the second portion of the user 
profile of the first user. 

60. The system of claim 59 wherein the notifier provides 
the first user with an option of storing the content in either 
the first or the second portion of the user knowledge profile 
of the first user. 

61. The system of claim 59 wherein the notifier deter- 
mines whether the confidence level automatically assigned 
to the content by the confidence logic exceeds a
predetermined threshold, and notifies the first user of the content
when the confidence level exceeds the predetermined thresh- 
old. 

62. The system of claim 33 wherein the electronic docu- 
ment comprises an electronic mail generated by the first 
user. 

63. The system of claim 33 wherein the electronic docu- 
ment comprises a document attached to an electronic mail 
message transmitted by the user. 

64. The system of claim 33 wherein the electronic docu- 
ment comprises a database inquiry generated by the first 
user. 

65. The system of claim 33 wherein the electronic docu- 
ment comprises any one of a group consisting of a user 
profile document and a list of bookmarks, folders, and 
directories generated by the first user. 

66. A data processing system for constructing a user 
knowledge profile, the system comprising: 

confidence means for examining an electronic document, 
associated with a first user, and for assigning a confi- 
dence level to content within the electronic document, 
the content being potentially indicative of a user knowl- 
edge base of the first user; and 
profiling means for storing the content in either the first or
the second portion of the user knowledge profile of the 
first user according to the assigned confidence level, 

wherein the first and the second portions of the user 
knowledge profile of the first user have different access 
restrictions with respect to a second user. 

67. A computer-readable medium for use by a computer 
for constructing a user knowledge profile storing a sequence 
of instructions that, when executed by a computer, cause the 
computer to perform the steps of: 

automatically assigning a confidence level to content 
within an electronic document associated with a first 
user, the content being potentially indicative of a user 
knowledge base of the first user; and 

storing the content in either the first or second portion of 
a user knowledge profile of the first user according to 
the assigned confidence level, 

wherein the first and the second portions of the user 
knowledge profile of the first user have different access 
restrictions with respect to a second user. 
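
By way of illustration only, the storing step recited in the independent claims, together with the threshold test of the dependent claims, might be sketched in Python as follows. The class `KnowledgeProfile` and its members are hypothetical names chosen for this sketch. Three behaviors are assumptions that the claims do not state: a confidence level exactly equal to the threshold is placed in the restricted portion, a term is moved between portions when its confidence level is stored again, and "restricted" is modeled as entirely hidden from a second user.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class KnowledgeProfile:
    """Hypothetical two-portion profile: one public, one private."""
    public_terms: Dict[str, float] = field(default_factory=dict)
    private_terms: Dict[str, float] = field(default_factory=dict)

    def store(self, term: str, confidence_level: float, threshold: float) -> None:
        """Store content in the first (public) or second (private) portion."""
        if confidence_level > threshold:
            # Exceeds the predetermined threshold: unrestricted portion.
            self.public_terms[term] = confidence_level
            self.private_terms.pop(term, None)
        else:
            # Below the threshold (or equal to it, by assumption): restricted portion.
            self.private_terms[term] = confidence_level
            self.public_terms.pop(term, None)

    def visible_to(self, viewer_is_owner: bool) -> Dict[str, float]:
        """Return the terms a viewer may see.

        Assumption: a second user sees only the public portion.
        """
        if viewer_is_owner:
            return {**self.private_terms, **self.public_terms}
        return dict(self.public_terms)
```

For example, storing "neural networks" with a confidence level of 0.8 and "sailing" with a confidence level of 0.2 against a threshold of 0.5 would expose only "neural networks" to a second user, while the first user would see both terms.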